Searching for Dependencies in Bayesian Classifiers

Author

  • Michael J. Pazzani
Abstract

Naive Bayesian classifiers, which make independence assumptions, perform remarkably well on some data sets but poorly on others. We explore ways to improve the Bayesian classifier by searching for dependencies among attributes. We propose and evaluate two algorithms for detecting dependencies among attributes and show that the backward sequential elimination and joining algorithm provides the most improvement over the naive Bayesian classifier. The domains on which the most improvement occurs are those domains on which the naive Bayesian classifier is significantly less accurate than a decision tree learner. This suggests that the attributes used in some common databases are not independent conditioned on the class and that the violations of the independence assumption that affect the accuracy of the classifier can be detected from training data.

The Bayesian classifier (Duda & Hart, 1973) is a probabilistic method for classification. It can be used to determine the probability that an example $j$ belongs to class $C_i$ given values of attributes of an example represented as a set of $n$ nominally-valued attribute-value pairs of the form $A_1 = V_{1j}$. If the attributes are independent, this probability is proportional to

$$P(C_i) \prod_{k=1}^{n} P(A_k = V_{kj} \mid C_i) \qquad (23.1)$$

Equation 23.1 is well suited for learning from data, since the probabilities $\hat{P}(C_i)$ and $\hat{P}(A_k = V_{kj} \mid C_i)$ may be estimated from the training data. To determine the most likely class of a test example, the probability of each class is computed with Equation 23.1. A classifier created in this manner is sometimes called a simple (Langley, 1993) or naive (Kononenko, 1990) Bayesian classifier.

One important evaluation metric for machine learning methods is the predictive accuracy on unseen examples. This is measured by randomly selecting a subset of the examples in a database to use as training examples and reserving the remainder to be used as test examples. In the case of the simple Bayesian classifier, the training examples are used to estimate probabilities and Equation 23.1 is then used ...

1996 Springer-Verlag.
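The classification rule and the hold-out evaluation described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function names (`train`, `score`, `classify`, `holdout_accuracy`), the toy data, the 2/3 training fraction, and the use of raw relative frequencies as probability estimates are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict

def train(examples):
    """Estimate P(C_i) and P(A_k = v | C_i) by counting.

    `examples` is a list of (attribute_tuple, class_label) pairs.
    Raw relative frequencies are an assumption of this sketch.
    """
    class_counts = Counter(label for _, label in examples)
    value_counts = defaultdict(Counter)  # (class, k) -> counts of values of A_k
    for attrs, label in examples:
        for k, v in enumerate(attrs):
            value_counts[(label, k)][v] += 1
    return class_counts, value_counts, len(examples)

def score(model, attrs, label):
    """Value of Equation 23.1 for one class, using the estimated probabilities."""
    class_counts, value_counts, n = model
    p = class_counts[label] / n
    for k, v in enumerate(attrs):
        p *= value_counts[(label, k)][v] / class_counts[label]
    return p

def classify(model, attrs):
    """Return the class with the largest score (ties broken by sorted order)."""
    return max(sorted(model[0]), key=lambda c: score(model, attrs, c))

def holdout_accuracy(examples, train_fraction=2 / 3, seed=0):
    """Randomly split into training and test examples; return test accuracy."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    model = train(shuffled[:cut])
    test_set = shuffled[cut:]
    return sum(classify(model, a) == c for a, c in test_set) / len(test_set)

# Hypothetical toy data: two nominal attributes, two classes.
toy = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("rainy", "hot"), "yes"),
]
model = train(toy)
print(score(model, ("rainy", "mild"), "yes"))  # 0.5 * (2/2) * (1/2) = 0.25
print(classify(model, ("rainy", "mild")))      # yes
```

With unsmoothed counts, an attribute value never seen with a class drives that class's score to zero, which is one reason practical implementations often use a smoothed estimate instead.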


Similar resources

Searching for Dependencies in Bayesian Classifiers

Naive Bayesian classifiers, which make independence assumptions, perform remarkably well on some data sets but poorly on others. We explore ways to improve the Bayesian classifier by searching for dependencies among attributes. We propose and evaluate two algorithms for detecting dependencies among attributes and show that the backward sequential elimination and joining algorithm provides the most ...


Visualizing the Simple Bayesian Classifier

The simple Bayesian classifier (SBC), sometimes called Naive-Bayes, is built on a conditional independence model of each attribute given the class. The model was previously shown to be surprisingly robust to obvious violations of this independence assumption, yielding accurate classification models even when there are clear conditional dependencies. The SBC can serve as an excellent tool fo...


Inductive and Bayesian learning in medical diagnosis

Although successful in medical diagnostic problems, inductive learning systems were not widely accepted in medical practice. In this paper, two different approaches to machine learning in medical applications are compared: the system for inductive learning of decision trees, Assistant, and the naive Bayesian classifier. Both methodologies were tested in four medical diagnostic problems: localization of ...


A Comparison of Event Models for Naive Bayes Text Classification

Recent approaches to text classification have used two different first-order probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multi-variate Bernoulli model, that is, a Bayesian network with no dependencies between words and binary word features (e.g. Larkey and Croft; Koller and Sahami). Others use a multinomial model, that is, a unigram language model with i...


A Bayesian Networks Approach to Reliability Analysis of a Launch Vehicle Liquid Propellant Engine

This paper presents an extension of Bayesian networks (BN) applied to the reliability analysis of an open gas generator cycle liquid propellant engine (OGLE) of launch vehicles (LV). There are several methods for system reliability analysis, such as RBD, FTA, FMEA, and Markov chains. For complex systems such as LV, however, not all of them are efficiently applicable, due to failure dependencies between compo...




Publication date: 1996